-
Notifications
You must be signed in to change notification settings - Fork 134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Warn on usage of Maps.transformValues #2518
base: develop
Are you sure you want to change the base?
Conversation
Generate changelog in
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I agree with this -- it's not so much an unexpected detail as the way that a common guava utility function works (and is documented). In most places that I'm aware of where this is used, this behavior is relied upon to avoid producing expensive intermediate objects.
We generally add checks like this when behavior has side-effects or edge cases that are wildly unexpected and dangerous, and the vast majority of usages in our codebases are provably unsound. I don't think this meets the bar for forcing hundreds or even thousands of code changes to be added.
I hear your point. It is documented, yet ambiguously named and easy to miss. It just caused a pretty bad P0 last week and I know of at least one more time it was erroneously used. Sure, Errorprone can't safeguard against all the possible bugs, but I do expect this to happen more often and wonder what we can do. I'm trying to think of a way to make it more specific, but struggle. Maybe, i.e., if we scan for places where transformValues is directly assigned to a variable of type Map, that'd narrow it down a little. Do you have any advice? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this can be an ERROR
level check as there is no automated fix (nor should there be), and this change would fail on rollout to many repos requiring broad suppressions or disabling of the check.
@BugPattern( | ||
link = "https://github.com/palantir/gradle-baseline#baseline-error-prone-checks", | ||
linkType = BugPattern.LinkType.CUSTOM, | ||
severity = SeverityLevel.ERROR, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I understand the issue that spurred this PR, I think this should be a SUGGESTION
as there are many uses of Maps#transformValues
that I would not want to robotically replace with eager transformations.
There are many places where we very intentionally use Maps.transformValues to avoid intermediate collection copies (due to both allocations, CPU, and memory tradeoffs) in places where we're handling large volumes of transformations. Some public examples would be AtlasDB's use throughout much of its data processing.
private static final String ERROR_MESSAGE = "The transformValues API of Guava Maps creates a lazily evaluated " | ||
+ "view of the source Map. Repeated access of the same key leads to repeated evaluations of the " | ||
+ "transformer function. This is often unintended and can cause severe performance degradation." | ||
+ "Where this is actually intended, suppress this warning."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned above, I'm hesitant to treat lazy transforms as something that completely needs to be avoided or surprised. If we were to do that, do we block all the other lazy APIs from Guava (e.g. Iterators/(Fluent)Iterables/Collections2/Lists/Sets/Maps
transform* methods)? Do we forbid returning Stream
from methods as these are lazy until terminal operation? Do we forbid all Guava APIs that return collection/map views?
I would argue no we should not forbid these to all of the above. That said, in general most places should probably prefer eager transforms by default (especially as it is simple to stream & collect into an Immutable*
or unmodifiable collection), but I don't think it is an error to use these well documented APIs as they were designed.
implements BugChecker.MethodInvocationTreeMatcher { | ||
private static final long serialVersionUID = 1L; | ||
private static final String ERROR_MESSAGE = "The transformValues API of Guava Maps creates a lazily evaluated " | ||
+ "view of the source Map. Repeated access of the same key leads to repeated evaluations of the " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure how feasible this would be, but a more specifically targeted checker around reuse of lazily transformed collections might lead to higher signal warnings. Possibly couple this with checks identified RPCs in lazy evaluation might also be high signal (as those may be places that should be utilizing a batch API if available).
Before this PR
After this PR
==COMMIT_MSG==
Warn on usage of Maps.transformValues
==COMMIT_MSG==
Possible downsides?